What is DjVu, and what are the secrets behind DjVu's superior performance?
Over 90 percent of the information in the world is still on paper.
Many of those paper documents include color graphics and/or photographs
that represent significant invested value. And almost none of that
rich content is on the Internet.
That's because scanning such documents and getting them onto a
Web site has been problematic at best. At the high resolution necessary
to ensure the readability of the text and to preserve the quality of
the images, file sizes become far too bulky for acceptable download speed.
Reducing resolution to achieve satisfactory download speed means forfeiting
quality and legibility. Conventional web formats such as JPEG, GIF, and PNG
produce prohibitively large image files at decent resolution. As a result,
Web site content developers have been largely unable to leverage existing
printed materials.
With DjVu, information that was previously trapped in hard-copy form
can now be made available to a wide audience.
Research institutions, libraries, and government agencies can
give access to their archives. Companies can distribute internal
documents on their intranets.
The commercialization of DjVu is handled by Seattle-based
LizardTech Inc. in partnership with AT&T Labs. DjVu is an open standard.
The file format specification, as well as open source
implementations of the decoder (and part of the encoder),
are available.
For color document images
that contain both text and pictures, DjVu files are typically 5
to 10 times smaller than JPEG at similar quality. For black-and-white
pages, DjVu files are typically 10 to 20 times smaller than JPEG
and 5 times smaller than GIF. DjVu files are also about 3 to 8
times smaller than black-and-white PDF files produced from scanned
documents (scanned documents in color are impractical in PDF).
In addition to scanned documents,
DjVu can also be applied to documents produced electronically in
formats such as Adobe's PostScript or PDF. In that case, file
sizes are between 15 and 20KB per page at 300 DPI.
The DjVu plug-in is available for standard Web browsers on various platforms.
It allows for easy panning and zooming of document images. A unique
on-the-fly decompression technology lets the plug-in display images that
would normally require 25MB of RAM when fully decompressed using only
2MB of RAM.
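The 25MB figure can be checked with simple arithmetic. The sketch below assumes a US Letter page (8.5 x 11 inches) scanned at 300 DPI in 24-bit color; the text above does not say which page size the figure refers to, so the page dimensions, the 1024 x 768 viewport, and the function name are illustrative assumptions, not a description of how the plug-in actually manages memory.

```python
def raw_rgb_bytes(width_px, height_px, bytes_per_pixel=3):
    """Bytes needed to hold an image as uncompressed pixels."""
    return width_px * height_px * bytes_per_pixel

# Assumed page: US Letter, 8.5 x 11 inches, at 300 DPI, 24-bit color.
dpi = 300
page_w = round(8.5 * dpi)   # 2550 pixels
page_h = round(11 * dpi)    # 3300 pixels
full_page = raw_rgb_bytes(page_w, page_h)
print(full_page)            # 25245000, i.e. roughly 25MB

# Assumed viewport: only the pixels shown on screen are held decoded.
viewport = raw_rgb_bytes(1024, 768)
print(viewport)             # 2359296, i.e. roughly 2MB
```

Under these assumptions, decoding only the visible region rather than the whole page accounts for a reduction of about the size quoted above.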
The DjVu format is progressive.
Users get an initial version of the page very quickly, and the visual
quality of the page progressively improves as more bits arrive.
For example, the text of a typical magazine page appears in
just three seconds over a 56Kbps modem connection. In another second
or two, the first versions of the pictures and backgrounds
appear. Then, after a few more seconds, the final full-quality version
of the page is complete.
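To put the timing in terms of data volume, the following sketch converts between transfer time and bytes at a given line rate. It assumes an ideal 56 Kbps link (56,000 bits per second) with no protocol overhead, which real modem connections do not achieve, and the function names are made up for this illustration.

```python
def bytes_delivered(seconds, line_rate_kbps):
    """Bytes that arrive in `seconds` at an ideal line rate (no overhead)."""
    return line_rate_kbps * 1000 * seconds // 8

def transfer_seconds(size_bytes, line_rate_kbps):
    """Seconds needed to move `size_bytes` at an ideal line rate."""
    return size_bytes * 8 / (line_rate_kbps * 1000)

# Three seconds at 56 Kbps carries at most this many bytes of text layer.
text_budget = bytes_delivered(3, 56)
print(text_budget)                                 # 21000

# A 20KB page (taken here as 20 * 1024 bytes) at the same rate.
print(round(transfer_seconds(20 * 1024, 56), 1))   # 2.9
```

So for the text of a page to appear in three seconds, the first chunk of the file that carries it can be at most about 21,000 bytes under these idealized assumptions.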
One of the main technologies behind
DjVu is the ability to separate an image into a
background layer (paper texture and pictures) and a foreground
layer (text and line drawings). Traditional image compression techniques
are fine for simple photographs, but they drastically degrade sharp
color transitions between adjacent highly contrasted areas,
which is why they render type so poorly. By separating the text from the
backgrounds, DjVu can keep the text at high resolution (thereby preserving
the sharp edges and maximizing legibility), while at the same time
compressing the backgrounds and pictures at lower resolution with
a wavelet-based compression technique.
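The layered model can be illustrated by showing how a viewer might recombine the layers. This is a minimal sketch, not the DjVu decoder: it assumes a full-resolution binary mask marking text pixels, a single foreground color, and a background stored at a reduced resolution and enlarged by nearest-neighbour lookup. The function name, the grayscale values, and the scale factor of 2 are all assumptions chosen to keep the example small.

```python
def compose(mask, fg_color, background, scale):
    """Rebuild a page from a full-resolution mask and a low-resolution background.

    mask       -- rows of 0/1 at full resolution (1 = foreground/text pixel)
    fg_color   -- color used for every foreground pixel
    background -- rows of colors at 1/scale of the full resolution
    scale      -- how many full-resolution pixels one background pixel covers
    """
    page = []
    for y, mask_row in enumerate(mask):
        row = []
        for x, is_text in enumerate(mask_row):
            if is_text:
                row.append(fg_color)                             # sharp text edge
            else:
                row.append(background[y // scale][x // scale])   # enlarged background
        page.append(row)
    return page

# 4x4 mask at full resolution, 2x2 grayscale background at half resolution.
mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
background = [
    [200, 180],
    [160, 140],
]
page = compose(mask, 0, background, 2)
for row in page:
    print(row)
# [200, 0, 0, 180]
# [200, 0, 180, 180]
# [160, 160, 140, 140]
# [160, 160, 140, 0]
```

The text pixels keep their exact full-resolution positions, while the background needs only a quarter as many stored samples in this toy case. A real decoder would use smoother interpolation and wavelet-decoded layers rather than raw arrays.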